Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A new PHP JIT implementation based on IR JIT framework #12079

Merged
merged 259 commits into from
Oct 23, 2023
Merged

Conversation

dstogov
Copy link
Member

@dstogov dstogov commented Aug 29, 2023

This PR provides a new JIT implementation based on IR - Lightweight
JIT Compilation Framework
.

Despite of the PHP 8.* JIT approach, that generates native code directly from
PHP byte-code, this implementation generates intermediate representation (IR)
and delegates all lower-level tasks to the IR Framework. IR for JIT is like an
AST for compiler.

Key benefits of the new JIT implementation:

  • Usage of IR opens possibilities for better optimization and register
    allocation (the resulting native code is more efficient)
  • PHP doesn't have to care about most low-level details (different CPUs,
    calling conventions, TLS details, etc)
  • it's much easier to implement support for new targets (e.g. RISCV)
  • IR framework is going to be developed separately from PHP and may accept
    contributions from other projects (new optimizations, improvements, bug fixes)

Disadvantages:

  • JIT compilation becomes slower (this is almost invisible for tracing
    JIT, but function JIT compilation of Wordpress becomes 4 times slower)

The necessary part of the IR Framework is embedded into php-src. So, the PR doesn't introduce new dependencies.

The new JIT implementation successfully passes all CI workflows, including
nightly, but it's still not mature and may cause failures.
To reduce risks, this patch doesn't remove the old JIT implementation (that
is the same as PHP-8.3 JIT). It's possible to build PHP with the old JIT by
configuring with --disable-opcache-jit-ir.
In the future the old implementation should be removed.

ext/opcache/jit/README-IR.md Outdated Show resolved Hide resolved
.gitmodules Outdated Show resolved Hide resolved
run-tests.php Outdated Show resolved Hide resolved
Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all thanks for working on this new IR JIT.
I have some small comments with the help of SA. I only made a quick look so far.
Probably most of these are just me misunderstanding the code though.

EDIT: github bugged out and put the comments at the wrong lines because of the large diff...
EDIT 2: actually, it looks fine if you click "Files Changed" in the menu above. It's just in the comments here that it shows at the wrong line.

ext/opcache/jit/zend_jit_ir.c Outdated Show resolved Hide resolved
ext/opcache/jit/zend_jit_ir.c Outdated Show resolved Hide resolved
ext/opcache/jit/zend_jit_ir.c Outdated Show resolved Hide resolved
ext/opcache/jit/zend_jit_ir.c Outdated Show resolved Hide resolved
ext/opcache/jit/zend_jit_ir.c Outdated Show resolved Hide resolved
@dstogov
Copy link
Member Author

dstogov commented Sep 11, 2023

@nielsdos thank you for the review. Only one of your comment was wrong, and the rest should be already fixed.

@dstogov
Copy link
Member Author

dstogov commented Sep 12, 2023

I wonder if there would be an interest to be able to export/save the IR (CLI method or similar)?

It's already possible to get the textual representation of the IR through -d opcache.jit_debug bits. e.g.

  • -d opcache.jit_debug=0x1000000 - IR created by PHP
  • -d opcache.jit_debug=0x2000000 - IR after all optimizations

This text may be loaded and processed without PHP, but generated code can't be executed, because the text includes hardcoded addresses of PHP API and helper functions.

That could open the door to be able to generate other targets using IR (not possible now for various limitations but wasm could be a very nice candidate to begin with).

This is one of the purposes of the IR project and the reason to develop it separately from PHP.
Adding a new IR backend should be easier then PHP one.

@bukka
Copy link
Member

bukka commented Sep 12, 2023

I haven't checked it much but sounds really promising. Think V8 is doing something similar so seems like the way forward.

I have got just one objection and that's usage of git submodules? They are cool for initial development especially if done by one person but I think they just suck if used for the bigger project with more contributors and will require updates for everyone building PHP from source (small but unnecessary). I also think it should be consistent with other bundled libs that are just included. Also when you look to other projects like nodejs, it includes sources in the repo. As I said I understand that it's easier for development but it complicates things for everyone else. In case we really want to use git submodules, it should be discussed separately and it will likely require RFC because I don't think there will be any agreement.

@dstogov
Copy link
Member Author

dstogov commented Sep 12, 2023

I haven't checked it much but sounds really promising. Think V8 is doing something similar so seems like the way forward.

Yes. V8 uses very similar IR and compilation pipe-line.

I have got just one objection and that's usage of git submodules? They are cool for initial development especially if done by one person but I think they just suck if used for the bigger project with more contributors and will require updates for everyone building PHP from source (small but unnecessary). I also think it should be consistent with other bundled libs that are just included. Also when you look to other projects like nodejs, it includes sources in the repo. As I said I understand that it's easier for development but it complicates things for everyone else. In case we really want to use git submodules, it should be discussed separately and it will likely require RFC because I don't think there will be any agreement.

I completely agree, usage of submodules should be the common decision and they definitely will make troubles.
It's easier to embed the necessary files.

@derickr do you manage libdate updates with some tools or manually?

@homersimpsons
Copy link

Would it make sense to re-use an existing JIT compiler project such as https://github.com/bytecodealliance/wasmtime/tree/main/cranelift over re-creating our own dedicated to php ?

I do not know a lot in this field but I think there are already some existing JIT compiler we could plug into.

@dstogov
Copy link
Member Author

dstogov commented Sep 13, 2023

Would it make sense to re-use an existing JIT compiler project such as https://github.com/bytecodealliance/wasmtime/tree/main/cranelift over re-creating our own dedicated to php ?

I do not know a lot in this field but I think there are already some existing JIT compiler we could plug into.

I already wrote LLVM based JIT, analysed usage of mir, libfim and few other JIT libraries.

Finally, I came to decision of creating a new PHP independent IR JIT framework mostly based on ideas bothered from HotSpot and LuaJIT. Combination of the ideas and practical approach allowed to make it much simple and significantly faster (PHP function JIT produces 15M of optimized native code per second, LLVM based PHP JIT did the same work more than a minute). This framework may be used for other languages (I have plans to write a C parser) and at the same time implements support for many PHP VM specific things.

Copy link
Contributor

@withinboredom withinboredom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty awesome. I just came to see how the sausage is made, though I don't understand everything, I like it.

Random suggestion would be to add a couple of attributes to hint to JIT what to do:

  • #[Jit\Hot] or something to hint that we want it always JIT'd because we expect it to be called very often.
  • #[Jit\Never] might be a useful escape hatch from JIT-specific bugs until a bugfix can be released.

ext/opcache/config.w32 Show resolved Hide resolved
}
}
}
if (need_move) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This if-statement is a bit hard to follow with the dangling else at the bottom. Might I suggest doing something like:

if(!need_move) {
  ra[i].flags |= ZREG_PHI;
  continue;
}

// do things in if-statement

It'd reduce some of the nesting and make it a bit easier to read.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. Following if/else should be easier than less-structural continue.
Anyway, this is just a coding preference.

}
} else if (ra[src].ref) {
ra[src].flags |= ZREG_STORE;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to below, I think we can add a continue here and flatten this loop out a bit.

Comment on lines +2731 to +2739
if (!ra[src].ref) {
ra[i].flags |= ZREG_LOAD;
} else {
ra[i].flags |= ZREG_PI;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like this would work:

ra[i].flags |= ra[src].ref ? ZREG_PI : ZREG_LOAD;

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. This probably makes sense.

int k, src;

if (phi->pi >= 0) {
src = phi->sources[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still wrapping my head around this, but is it possible for src to be a non-existent index on ra?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.
ra[] is an array of size ssa->vars_count.
src as a SSA variable index, that must be in range [0..ssa->vars_count-1].

/* Remove useless register allocation */
for (i = 0; i < ssa->vars_count; i++) {
if (ra[i].ref &&
((ra[i].flags & ZREG_LOAD) ||
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From above, if ra[i].ref is true, then ra[i].flags will always be ZREG_LOAD, maybe? Actually, not always. Looks like it's conditional, but leaving this comment here for now.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of this loop is avoiding register allocation for variables that are not used directly but have to be loaded and stored at the same memory location.

@derickr
Copy link
Member

derickr commented Sep 15, 2023

@derickr do you manage libdate updates with some tools or manually?

I merge it by hand. There are only very few changes, so I use Meld to visually compare and merge changes over.

@dstogov dstogov merged commit caf102d into php:master Oct 23, 2023
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.